Modelling the Internal Variability of MWEs
Author
Abstract
The flexibility of multiword expressions (MWEs) is crucial to their identification and extraction in running text, as well as to a better understanding of them from a linguistic perspective. If we project a large MWE lexicon onto a corpus, matching only fixed forms suffers from low recall, while an unconstrained flexible search over lemmas costs precision. In this talk, I will describe a method aimed at maximising precision when identifying MWEs in flexible mode, building on the idea that internal variability can be modelled via so-called variation patterns. I will discuss the advantages and limitations of using variation patterns, compare their performance to that of association measures, and explore their usability in MWE extraction as well.
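The trade-off between fixed-form and lemma-based projection can be illustrated with a minimal Python sketch. It is not the method presented in the talk: the abstract does not define how variation patterns are represented, so the sketch assumes a pattern is a tuple marking each component as "fix" (it appears in its lexicon form) or "flex" (it matches only by lemma). All function names, the example sentences, and the allowed-pattern set are hypothetical.

```python
from typing import List, Set, Tuple

Token = Tuple[str, str]  # (surface form, lemma); pre-lemmatised input is assumed


def find_fixed(sentence: List[Token], forms: List[str]) -> List[int]:
    """Start indices where the lexicon forms occur verbatim and contiguously."""
    surface = [form.lower() for form, _ in sentence]
    n = len(forms)
    return [i for i in range(len(surface) - n + 1) if surface[i:i + n] == forms]


def find_flexible(sentence: List[Token], lemmas: List[str]) -> List[int]:
    """Start indices where the lemmas occur contiguously, whatever the inflection."""
    sent_lemmas = [lemma.lower() for _, lemma in sentence]
    n = len(lemmas)
    return [i for i in range(len(sent_lemmas) - n + 1) if sent_lemmas[i:i + n] == lemmas]


def variation_pattern(sentence: List[Token], start: int, forms: List[str]) -> Tuple[str, ...]:
    """Assumed encoding: 'fix' if a component keeps its lexicon form, else 'flex'."""
    return tuple(
        "fix" if sentence[start + k][0].lower() == forms[k] else "flex"
        for k in range(len(forms))
    )


def find_with_patterns(sentence: List[Token], forms: List[str], lemmas: List[str],
                       allowed: Set[Tuple[str, ...]]) -> List[int]:
    """Lemma-based matches, kept only when their variation pattern is allowed."""
    return [
        i for i in find_flexible(sentence, lemmas)
        if variation_pattern(sentence, i, forms) in allowed
    ]


# Hypothetical lexicon entry: only the verb is allowed to inflect.
FORMS = ["kick", "the", "bucket"]
LEMMAS = ["kick", "the", "bucket"]
ALLOWED = {("fix", "fix", "fix"), ("flex", "fix", "fix")}

# Hypothetical sentences: the first is idiomatic, the second is literal.
S1 = [("He", "he"), ("kicked", "kick"), ("the", "the"), ("bucket", "bucket")]
S2 = [("They", "they"), ("kicked", "kick"), ("the", "the"),
      ("buckets", "bucket"), ("over", "over")]
```

On these two invented sentences, `find_fixed` misses both (the recall problem), `find_flexible` accepts both (the precision problem), and `find_with_patterns` keeps only the first, because the plural noun in the second yields a pattern outside the allowed set.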
Similar resources
The PARSEME Shared Task on Automatic Identification of Verbal Multiword Expressions
Multiword expressions (MWEs) are known as a “pain in the neck” for NLP due to their idiosyncratic behaviour. While some categories of MWEs have been addressed by many studies, verbal MWEs (VMWEs), such as to take a decision, to break one’s heart or to turn off, have been rarely modelled. This is notably due to their syntactic variability, which hinders treating them as “words with spaces”. We d...
Automated Acquisition of Multiword Expressions for Robust Deep Parsing
In this presentation, I mainly deal with automated acquisition of Multiword Expressions as a means of enhancing robustness of lexicalised grammars used in robust deep parsing for real-life applications. Specifically, I begin by taking a closer look at the linguistic properties of MWEs, in particular, their lexical, syntactic, as well as semantic characteristics. The term Multiword Expressions h...
A Repository of Variation Patterns for Multiword Expressions
One of the crucial issues in the analysis and processing of MWEs is their internal variability. Indeed, the feature that most characterises MWEs is their fixedness at some level of linguistic analysis, be it morphology, syntax, or semantics. The morphological aspect is not trivial in languages which exhibit a rich morphology, such as Romance languages. The issue is relevant in at least three ...
Using geostatistical and deterministic modelling to identify spatial variability of groundwater quality
The main portion of the water demand of arid regions like Kashan Plain, Iran, is supplied by groundwater wells. This research was conducted to assess groundwater quality, and to model and map it in the study area, using geostatistical and deterministic techniques. Five water quality parameters, including Electrical Conductivity, Sodium Adsorption Ratio, Total Hardness, ...
Towards an Empirical Subcategorization of Multiword Expressions
The subcategorization of multiword expressions (MWEs) is still problematic because of the great variability of their phenomenology. This article presents an attempt to categorize Italian nominal MWEs on the basis of their syntactic and semantic behaviour by considering features that can be tested on corpora. Our analysis shows how these features can lead to a differentiation of the expressions ...